Find a duplicate, Space Edition™ BEAST MODE
In Find a duplicate, Space Edition™, we were given a list of integers where:
These properties mean the list must have at least 1 duplicate. Our challenge was to find a duplicate number without modifying the input and optimizing for space. We used a divide and conquer approach, iteratively cutting the list in half to find a duplicate integer in time and space (sort of a modified binary search).
But we can actually do better. We can find a duplicate integer in time while keeping our space cost at .
This is a tricky one to derive (unless you have a strong background in graph theory), so we'll get you started:
Imagine each item in the list as a node in a linked list. In any linked list, each node has a value and a "next" pointer. In this case:
Here’s a full example:
Notice we're using "positions" and not "indices." For this problem, we'll use the word "position" to mean something like "index," but different: indices start at 0, while positions start at 1. More rigorously: position = index + 1.
Using this, find a duplicate integer in time while keeping our space cost at . Just like before, don't modify the input.
Drawing pictures will help a lot with this one. Grab some paper and pencil (or a whiteboard, if you have one).
We treat the input list as a linked list like we described at the top in the problem.
To find a duplicate integer:
We want to think of our list as a linked list but we don't want to actually use up all that space, so we traverse our list as if it were a linked list by converting positions to indices.
def find_duplicate(int_list):
n = len(int_list) - 1
# STEP 1: GET INSIDE A CYCLE
# Start at position n+1 and walk n steps to
# find a position guaranteed to be in a cycle
position_in_cycle = n + 1
for _ in range(n):
position_in_cycle = int_list[position_in_cycle - 1]
# we subtract 1 from the current position to step ahead:
# the 2nd *position* in a list is *index* 1
# STEP 2: FIND THE LENGTH OF THE CYCLE
# Find the length of the cycle by remembering a position in the cycle
# and counting the steps it takes to get back to that position
remembered_position_in_cycle = position_in_cycle
current_position_in_cycle = int_list[position_in_cycle - 1] # 1 step ahead
cycle_step_count = 1
while current_position_in_cycle != remembered_position_in_cycle:
current_position_in_cycle = int_list[current_position_in_cycle - 1]
cycle_step_count += 1
# STEP 3: FIND THE FIRST NODE OF THE CYCLE
# Start two pointers
# (1) at position n+1
# (2) ahead of position n+1 as many steps as the cycle's length
pointer_start = n + 1
pointer_ahead = n + 1
for _ in range(cycle_step_count):
pointer_ahead = int_list[pointer_ahead - 1]
# Advance until the pointers are in the same position
# which is the first node in the cycle
while pointer_start != pointer_ahead:
pointer_start = int_list[pointer_start - 1]
pointer_ahead = int_list[pointer_ahead - 1]
# Since there are multiple values pointing to the first node
# in the cycle, its position is a duplicate in our list
return pointer_start
time and space.
Our space cost is because all of our additional variables are integers, which each take constant space.
For our runtime, we iterate over the array a constant number of times, and each iteration takes time in its worst case. So we traverse the linked list more than once, but it's still a constant number of times—about 3.
There another approach using randomized algorithms that is time and space. Can you come up with that one? (Hint: You'll want to focus on the median.)
This one's pretty crazy. It's hard to imagine an interviewer expecting you to get all the way through this question without help.
But just because it takes a few hints to get to the answer doesn't mean a question is "too hard." Some interviewers expect they'll have to offer a few hints.
So if you get a hint in an interview, just relax and listen. The most impressive thing you can do is drop what you're doing, fully understand the hint, and then run with it.
Do you have an answer?
Wanna review this one again later? Or do you feel like you got it all?
Mark as done Pin for review laterReset editor
Powered by qualified.io